Dynamic Grained Encoder for Vision Transformers (Appendix A: Limitation and Future Work)
(Figure 1: the budget for DGE is set to 0.5; "Resolution" refers to the side length of input images; the gating scores in each DGE layer are measured with the metric from [64].) As shown in Figure 1(a), one limitation of our work is that the acceleration ratio on GPUs (with a native PyTorch implementation) is poor when the input image size is small. As shown in Tab. 5 and Tab.
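The small-image slowdown is a measurable effect: the sparse gather/scatter work a dynamic gating layer introduces is a fixed overhead that the FLOP savings cannot amortize at low resolutions. A minimal, hypothetical PyTorch timing harness for reproducing such an acceleration-ratio curve (the models and resolution values are placeholders, not the paper's benchmark code) might look like this:

```python
import time
import torch

def throughput(model, resolution, batch=32, iters=50, device="cuda"):
    """Images/second for `model` on square inputs of side `resolution`."""
    model = model.to(device).eval()
    x = torch.randn(batch, 3, resolution, resolution, device=device)
    with torch.no_grad():
        for _ in range(5):                      # warm-up iterations
            model(x)
        if device.startswith("cuda"):
            torch.cuda.synchronize()
        start = time.time()
        for _ in range(iters):
            model(x)
        if device.startswith("cuda"):
            torch.cuda.synchronize()
    return batch * iters / (time.time() - start)

# Hypothetical usage: acceleration ratio of a gated model over its dense
# baseline, swept over resolutions. Ratios near (or below) 1.0 at small
# resolutions reflect the gating overhead described above.
# for r in (224, 384, 512, 640):
#     print(r, throughput(gated, r) / throughput(dense, r))
```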
A. Limitations and Future Work
We believe that the [...] All real-world datasets analysed consist of sequence reads of the same part of the genome. This is a widespread set-up for sequence analysis, but not a ubiquitous one. In this project we work with edit distances between sequences; these are too expensive to compute for large-scale analysis, but it is feasible to produce a large enough training set. We describe here the methods most closely related to our work; however, these are bound to a quadratic complexity w.r.t. the length of the input sequence, the best algorithm [...]. Experiments were also run on synthetic datasets formed of randomly generated sequences.
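For context, the quadratic cost mentioned above comes from the classic dynamic program: aligning sequences of lengths m and n fills an m-by-n table. A minimal sketch of that baseline (standard Levenshtein distance, not the paper's implementation):

```python
def edit_distance(a: str, b: str) -> int:
    """Classic DP edit (Levenshtein) distance: O(len(a) * len(b)) time."""
    prev = list(range(len(b) + 1))   # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        curr = [i]                   # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute
        prev = curr
    return prev[-1]

assert edit_distance("ACGT", "AGT") == 1  # one deletion
```

This quadratic scan over every character pair is exactly what makes exact edit distance impractical at genome scale, and what learned embeddings aim to approximate cheaply.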
Improving Human-Autonomous Vehicle Interaction in Complex Systems
Unresolved questions about how autonomous vehicles (AVs) should meet the informational needs of riders hinder real-world adoption. Satisfying rider needs is complicated by the fact that different people, goals, and driving contexts impose different criteria for what constitutes interaction success. Unfortunately, most human-AV research and design today treats all people and situations uniformly. It is crucial to understand how an AV should communicate to meet rider needs, and how communications should change when the human-AV complex system changes. I argue that understanding the relationships between different aspects of the human-AV system can help us build improved and adaptable AV communications. I support this argument using three empirical studies. First, I identify optimal communication strategies that enhance driving performance, confidence, and trust for learning in extreme driving environments. Findings highlight the need for task-sensitive, modality-appropriate communications tuned to learners' cognitive limits and goals. Second, I highlight the consequences of deploying faulty communication systems and demonstrate the need for context-sensitive communications. Third, I use machine learning (ML) to illuminate personal factors predicting trust in AVs, emphasizing the importance of tailoring designs to individual traits and concerns. Together, this dissertation supports the necessity of transparent, adaptable, and personalized AV systems that cater to individual needs, goals, and contextual demands. By considering the complex system within which human-AV interactions occur, we can deliver valuable insights for designers, researchers, and policymakers. This dissertation also provides a concrete domain to study theories of human-machine joint action and situational awareness, and can be used to guide future human-AI interaction research. [shortened for arxiv]
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > United States > California > San Diego County > San Diego (0.04)
- North America > United States > Ohio (0.04)
- (10 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Questionnaire & Opinion Survey (1.00)
- (2 more...)
- Transportation > Passenger (1.00)
- Transportation > Ground > Road (1.00)
- Information Technology > Robotics & Automation (1.00)
- (10 more...)
Causality for Natural Language Processing
In the field of natural language processing (NLP), the capability to infer and reason about causality is increasingly recognized as a critical component of intelligent systems. Despite the recent advancement of large language models (LLMs) (Radford et al., 2019; Devlin et al., 2019; Brown et al., 2020; Zhang et al., 2022; OpenAI, 2023; Ignat et al., 2024, inter alia), a key question remains: can these models understand and reason about causality? This is a critical skill before we can trust AI agents to be integrated into decision-making systems. Moreover, even if LLMs succeed at reasoning to some extent, they still lack transparency about how their decisions are made, creating a strong need for interpretability (Luo and Specia, 2024; Räuker et al., 2023; Zou et al., 2023). To bridge this gap, this thesis explores various facets of causal reasoning in LLMs. We present a series of studies that collectively advance our knowledge of how well these models perform causal reasoning (Part I), how their decisions are made (Part II), how causality among learning variables influences NLP tasks (Part III), and how causality and NLP can together analyze social problems (Part IV). Below, we give an overview of the four parts and their corresponding chapters.
- North America > United States > Washington > King County > Seattle (0.27)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.13)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.13)
- (71 more...)
- Workflow (1.00)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Overview (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.92)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.92)
Causal Evaluation of Language Models
Sirui Chen, Bo Peng, Meiqi Chen, Ruiqi Wang, Mengying Xu, Xingyu Zeng, Rui Zhao, Shengjie Zhao, Yu Qiao, Chaochao Lu
Causal reasoning is viewed as crucial for achieving human-level machine intelligence. Recent advances in language models have expanded the horizons of artificial intelligence across various domains, sparking inquiries into their potential for causal reasoning. In this work, we introduce Causal evaluation of Language Models (CaLM), which, to the best of our knowledge, is the first comprehensive benchmark for evaluating the causal reasoning capabilities of language models. First, we propose the CaLM framework, which establishes a foundational taxonomy consisting of four modules: causal target (i.e., what to evaluate), adaptation (i.e., how to obtain the results), metric (i.e., how to measure the results), and error (i.e., how to analyze the bad results). This taxonomy defines a broad evaluation design space while systematically selecting criteria and priorities. Second, we compose the CaLM dataset, comprising 126,334 data samples, to provide curated sets of causal targets, adaptations, metrics, and errors, offering extensive coverage for diverse research pursuits. Third, we conduct an extensive evaluation of 28 leading language models on a core set of 92 causal targets, 9 adaptations, 7 metrics, and 12 error types. Fourth, we perform detailed analyses of the evaluation results across various dimensions (e.g., adaptation, scale). Fifth, we present 50 high-level empirical findings across 9 dimensions (e.g., model), providing valuable guidance for future language model development. Finally, we develop a multifaceted platform, including a website, leaderboards, datasets, and toolkits, to support scalable and adaptable assessments. We envision CaLM as an ever-evolving benchmark for the community, systematically updated with new causal targets, adaptations, models, metrics, and error types to reflect ongoing research advancements. Project website is at https://opencausalab.github.io/CaLM.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.13)
- Asia > China > Shanghai > Shanghai (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (9 more...)
- Research Report > Strength High (1.00)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Overview (1.00)
- Law (1.00)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
- Health & Medicine > Epidemiology (1.00)
- (9 more...)
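As a concrete illustration of the four-module taxonomy in the CaLM abstract above (causal target, adaptation, metric, error), here is a minimal, hypothetical sketch of what a single evaluation record could look like; the field names and example values are assumptions for illustration, not the actual CaLM dataset schema or toolkit API:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CausalEvalRecord:
    """One record in a CaLM-style evaluation, organized around the
    four-module taxonomy. Illustrative only, not the CaLM schema."""
    causal_target: str                 # what to evaluate
    adaptation: str                    # how to obtain the result
    metric: str                        # how to measure the result
    prompt: str
    gold_answer: str
    model_answer: Optional[str] = None
    error_type: Optional[str] = None   # how to analyze a bad result

record = CausalEvalRecord(
    causal_target="pairwise causal discovery",
    adaptation="zero-shot prompting",
    metric="accuracy",
    prompt="Does altitude causally affect air pressure? Answer yes or no.",
    gold_answer="yes",
)
```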
Zero is Not Hero Yet: Benchmarking Zero-Shot Performance of LLMs for Financial Tasks
Recently, large language models (LLMs) like ChatGPT have shown impressive zero-shot performance on many natural language processing tasks. In this paper, we investigate the effectiveness of zero-shot LLMs in the financial domain. We compare the zero-shot performance of ChatGPT and several open-source generative LLMs against RoBERTa fine-tuned on annotated data. We address three inter-related research questions on data annotation, performance gaps, and the feasibility of employing generative models in the finance domain. Our findings demonstrate that ChatGPT performs well even without labeled data, but fine-tuned models generally outperform it. Our research also highlights that annotating with generative models can be time-intensive. Our codebase is publicly available on GitHub under the CC BY-NC 4.0 license.
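As a rough illustration of the zero-shot setup the paper compares against fine-tuning, here is a hedged sketch of prompting an arbitrary LLM for financial sentiment labels; the prompt wording and the injected `complete` callable are assumptions, not the authors' released code:

```python
from typing import Callable, List

LABELS = ("positive", "negative", "neutral")

def zero_shot_sentiment(texts: List[str],
                        complete: Callable[[str], str]) -> List[str]:
    """Label financial sentences via a zero-shot prompt.

    `complete` is any prompt-to-text function (e.g. a thin wrapper
    around an LLM API); it is injected so the sketch stays
    model-agnostic."""
    preds = []
    for text in texts:
        prompt = ("Classify the sentiment of the following financial "
                  "sentence as positive, negative, or neutral.\n"
                  f"Sentence: {text}\nLabel:")
        answer = complete(prompt).strip().lower()
        # Fall back to "neutral" when the model's reply is off-label.
        preds.append(answer if answer in LABELS else "neutral")
    return preds
```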
Learning Preferences for Interactive Autonomy
When robots enter everyday human environments, they need to understand their tasks and how they should perform them. Both are encoded with reward functions, which specify the robot's objective. However, designing reward functions can be extremely challenging for complex tasks and environments. A promising alternative is to learn reward functions from humans. Several recent robot learning works embrace this approach and leverage human demonstrations to learn reward functions. Known as inverse reinforcement learning, this approach relies on a fundamental assumption: that humans can provide near-optimal demonstrations to the robot. Unfortunately, this is rarely the case: human demonstrations are often suboptimal for various reasons, e.g., the difficulty of teleoperation, the robot's many degrees of freedom, or humans' cognitive limitations. This thesis is an attempt to learn reward functions from human users via other, more reliable data modalities. Specifically, we study how reward functions can be learned from comparative feedback, in which the human user compares multiple robot trajectories instead of (or in addition to) providing demonstrations. To this end, we first propose various forms of comparative feedback, e.g., pairwise comparisons, best-of-many choices, rankings, and scaled comparisons, and describe how a robot can use these forms of human feedback to infer a reward function, which may be parametric or non-parametric. Next, we propose active learning techniques that enable the robot to ask for the comparison feedback that maximizes the expected information gain from the user's response. Finally, we demonstrate the applicability of our methods in a wide variety of domains, ranging from autonomous driving simulations to home robotics, and from standard reinforcement learning benchmarks to lower-body exoskeletons.
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
- North America > United States > Massachusetts > Plymouth County > Hanover (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Asia > Middle East > Republic of Türkiye (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Questionnaire & Opinion Survey (1.00)
- Transportation > Ground > Road (1.00)
- Health & Medicine (1.00)
- Automobiles & Trucks (1.00)
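The core learning setup in the abstract above, inferring a reward function from pairwise comparisons, is commonly modeled with a Bradley-Terry (logistic) choice model. A minimal sketch under that assumption, with a linear reward over trajectory features (the function name, feature representation, and hyperparameters are illustrative, not the thesis's code):

```python
import torch

def learn_reward_from_comparisons(feats_a: torch.Tensor,
                                  feats_b: torch.Tensor,
                                  prefers_a: torch.Tensor,
                                  steps: int = 500,
                                  lr: float = 0.05) -> torch.Tensor:
    """Fit a linear reward R(x) = w @ x from pairwise comparisons under
    a Bradley-Terry model: P(A preferred to B) = sigmoid(R(A) - R(B)).

    feats_a, feats_b: (N, d) trajectory feature vectors.
    prefers_a: (N,) with 1 where the user preferred trajectory A."""
    w = torch.zeros(feats_a.shape[1], requires_grad=True)
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        logits = (feats_a - feats_b) @ w   # R(A) - R(B) per comparison
        loss = torch.nn.functional.binary_cross_entropy_with_logits(
            logits, prefers_a.float())
        opt.zero_grad()
        loss.backward()
        opt.step()
    return w.detach()
```

The same likelihood extends naturally to best-of-many choices and rankings (via Plackett-Luce), and its expected information gain is what the active querying techniques described in the abstract optimize when selecting which comparison to ask next.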